Spam Detection Based on Feature Evolution to Deal with Concept Drift

Authors

Abstract

Electronic messages are still considered the most significant tools in business and personal applications due to their low cost and easy access. However, e-mails have become a major problem owing to the high amount of junk mail, named spam, which fills the e-mail boxes of users. Several approaches have been proposed to detect spam, such as filters implemented in e-mail servers and user-based spam message classification mechanisms. A major problem with these approaches is spam detection in the presence of concept drift, especially as a result of changes in features over time. To overcome this problem, this work proposes a new spam detection system based on analyzing the evolution of features. The proposed method is divided into three steps: 1) spam classification model training; 2) concept drift detection; and 3) knowledge transfer learning. The first step generates classification models, as commonly conducted in machine learning. The second step introduces a new strategy to avoid concept drift: SFS (Similarity-based Features Selection), which analyzes the evolution of the features, taking into account the similarity obtained between the feature vectors extracted from training data and test data. Finally, the third step focuses on the following questions: what, how, and when to transfer acquired knowledge? The proposed method is evaluated using two public datasets. The results of the experiments show that it is possible to infer a threshold to detect changes (drift) in order to ensure that the spam classification model is updated through knowledge transfer. Moreover, our anomaly detection system is able to perform spam classification and concept drift detection as two parallel and independent tasks.
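
The abstract does not say which similarity measure or threshold value SFS uses, so the following is only a minimal sketch of the general idea of comparing training and test feature vectors against a threshold. It assumes normalized term frequencies as features, cosine similarity as the measure, and an arbitrary threshold of 0.5; the function names `term_frequencies`, `cosine_similarity`, and `drift_detected` are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def term_frequencies(messages):
    """Build a normalized term-frequency vector (word -> share) from messages."""
    counts = Counter(word for msg in messages for word in msg.lower().split())
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()} if total else {}


def cosine_similarity(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[w] * v[w] for w in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def drift_detected(train_messages, test_messages, threshold=0.5):
    """Flag drift when train/test similarity falls below the threshold.

    The threshold of 0.5 is an illustrative assumption, not a value from the paper.
    """
    similarity = cosine_similarity(
        term_frequencies(train_messages), term_frequencies(test_messages)
    )
    return similarity < threshold, similarity


train = ["cheap pills online", "win money now"]
similar_batch = ["win cheap pills now"]
shifted_batch = ["crypto wallet airdrop bonus"]

print(drift_detected(train, similar_batch))  # vocabulary overlaps: no drift flagged
print(drift_detected(train, shifted_batch))  # disjoint vocabulary: drift flagged
```

In a pipeline like the one the abstract outlines, a drift flag of this kind would be the trigger for updating the classification model through knowledge transfer; how that update is carried out is not covered by this sketch.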

Related resources

Dynamic Concept Drift Detection for Spam Email Filtering

Nowadays, most Internet users suffer from spam emails. Filtering is one of the effective methods that help us get rid of spam emails. One of the problems of filtering is that it cannot detect spam emails accurately when the concepts change or drift happens as time goes by. Therefore, it is required to handle concept drift accurately and quickly. This paper proposes a new alg...

Concept Drift Detection Based on Anomaly Analysis

In online machine learning, the ability to adapt to new concepts quickly is highly desired. In this paper, we propose a novel concept drift detection method, called Anomaly Analysis Drift Detection (AADD), to improve the performance of machine learning algorithms in non-stationary environments. The proposed AADD method is based on an anomaly analysis of learner’s accuracy associate wi...

Tracking Concept Drift at Feature Selection Stage in SpamHunting: An Anti-spam Instance-Based Reasoning System

In this paper, we propose a novel feature selection method able to handle concept drift problems in the spam filtering domain. The proposed technique is applied to a previously successful instance-based reasoning e-mail filtering system called SpamHunting. Our achieved information criterion is based on several ideas extracted from the well-known information measure introduced by Shannon. We show how r...

Concept drift detection in business process logs using deep learning

Process mining provides a bridge between process modeling and analysis on the one hand and data mining on the other hand. Process mining aims at discovering, monitoring, and improving real processes by extracting knowledge from event logs. However, as most business processes change over time (e.g., the effects of new legislation, seasonal effects, etc.), traditional process mining techniques ...

Adaptive Concept Drift Detection

Concept drift is an important problem in the context of machine learning and data mining. It can be described as a change in the fundamental concepts underlying the data, or, in its most basic form, as a significant change in the distribution of the data. From a learning theoretic point of view, one can say that concept drift is a violation of the i.i.d. assumption, which states that each examp...

Journal

Journal title: Journal of Universal Computer Science

Year: 2021

ISSN: 0948-695X, 0948-6968

DOI: https://doi.org/10.3897/jucs.66284